1. The LEAP Language


1.a. Introduction

Associative memory, the accessing of data through a partial
specification of its contents, has long been a subject of interest in
Computer Science. Although it is no longer considered a panacea,
associative memory has proved to be of great value in artificial
intelligence, operating systems and computer aided design and is a major
aspect of information retrieval. Various proposals for building large
hardware associative memories have not materialized [12] and essentially
all the applications to date have depended on software schemes of some
sort. In this paper, we present a programming language for software
associative memory systems and describe a particular scheme which seems
to have several nice properties.

There  are  two  basic  problems  in  designing  any  programming  system: 
ease  of  use  and  efficiency  of  execution.  In  this  section  we  discuss  a 
programming  language  which  users  have  found  convenient  for  associative 
processing;  Section  2  describes  an  implementation  of  the  language  which 
is  quite  efficient  over  a  range  of  problems. 

The language, LEAP, is an extension of ALGOL [22] to include
associations, sets and a number of auxiliary constructs. The problem of
describing a programming language in a journal-size article is quite
difficult, especially if the system contains new ideas. The problem is
further exacerbated in this case by the fact that LEAP is really a family
of languages, each adding a different set of features to the ALGOL base.
Besides the constructs described here there are forms of LEAP with matrix
operations, property sets, and on-line graphics [29, JO]. Even the
associative language has two compilers: one which does dynamic type
checking at execution time and another (described here) which does not.
This notion of fluid language specifications runs against the current
emphasis on standardized languages and will be considered further.

The fluidity of the LEAP design is largely due to its implementation
by means of the translator writing system, VITAL [9, 20]. Our variation
of ALGOL (also described in [20]) was purposely designed with the
possibility of being extended, and various applications groups have
tailored it to their needs. With such a system and a good linkage editor,
there seems to be no advantage to standardization. We view the design of
better universal languages like the design of better crutches: not of
direct interest to those possessing complete facilities.

This attitude toward programming languages has also influenced the
style of this paper. Rather than attempt to spell out in detail every
nuance of the language, we have concentrated on the major details.
Although LEAP is even less context-free than other programming languages,
we have expressed the form of constructs in Backus-Naur Form. The sections
on semantics and pragmatics, and the examples, serve to delimit the
syntactically correct constructs which are meaningful. The complete formal
description of LEAP in the notation of VITAL (cf. [9]) is available from
the authors. This description does not include the data-structure
implementation, which was done separately and is described in Section 2.
A translator writing system capable of automating data-structure design
would be a significant contribution to the field [JO].
Many of the ideas in the associative language, LEAP, have
occurred before. If we consider an association to be a triple:

attribute of object is value

we see that the earliest list processing systems [18, 23] included some
associative processing. By considering an associative structure as a
colored, directed graph we can relate our work to string and tree
manipulation languages. Both these developments have been recently
surveyed in [3]. Another field in which associative structures have played
a central role is information retrieval [5]. Sets have appeared in
simulation languages [32] among others.

The  LEAP  system  is  a  continuation  [7,  26,  27]  of  our  earlier  work 
on  associative  processing.  The  new  problems  attacked  were  the  development 
of  a  convenient  language  for  handling  complex  collections  of  associations 
and  the  extension  of  the  hash-coding  structure  techniques  to  a  paging 
system. We were also interested in the implications of LEAP as a
non-procedural language; the sequencing of associative retrieval statements
is decided by the translator and the complexity of this process (cf. [13])
portends a difficult future for non-procedural systems. The LEAP system
has  been  running  on  the  Lincoln  Laboratory  TX-2  since  early  1967  and  has 
been  used  in  a  number  of  applications,  some  of  which  will  be  discussed  in 
Section  3. 
1.b. Data types

The  basic  objects  of  manipulation  in  the  system  are  the  item 
and  an  ordered  triple  of  items  called  an  association,  a  relation  or 
just  a  triple.  In  addition  to  items,  we  introduce  the  auxiliary  notions 
of  set,  itemvar,  and  local  variables. 


1.b.1. Syntax

<simple type> ::= real | integer | boolean

<algebraic type> ::= <simple type> | <simple type> array

<leap type> ::= item | itemvar | local

<type> ::= <algebraic type> | <leap type> | set |
           <algebraic type> <leap type>


1.b.2. Semantics


A  variable  of  algebraic  type  behaves  exactly  like  an  ALGOL  variable. 

An item is a symbol, like a LISP [16] atom, which can be manipulated
by the system. If an item is declared with an algebraic type, it will
also have an associated datum of that type. Thus there are some simple
variables which can be used only algebraically, some which can be used
only symbolically and some which can be used in either manner. A set is
an unordered collection of items containing at most one occurrence of
any item.

An itemvar is a variable whose value is an item (it is a reference
[36, 37] to an item). A local is an itemvar which obeys special binding
rules and is used only in a restricted set of contexts (cf. Section 1.d;
the earliest use of locals was in the string processing languages [5]).

If an itemvar (local) is declared to have an algebraic type, the item which
is the value of the itemvar (local) is assumed to have a datum of that type
in algebraic operations.

1.b.3. Pragmatics

The  division  of  variables  into  algebraic,  symbolic  and  combination 
types  is  for  efficiency  reasons.  The  combination  variable  requires  twice 
as  much  space  as  either  of  the  others  and  produces  slower  code  than  the 
algebraic variable. Sets are implemented by a linked list ordered by the
internal code of the items; this allows the system to carry out, e.g., set
comparisons in time proportional to the sum of the sizes of the sets rather
than their product.
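The benefit of keeping sets ordered by internal item code can be illustrated with a short sketch. It is not the TX-2 implementation: Python lists of integer codes stand in for the linked lists, and the function names `is_subset` and `union` are hypothetical. Each routine advances through both operands once, so the work grows with the sum of the set sizes.

```python
def is_subset(a, b):
    """True if every code in sorted list a also occurs in sorted list b.

    Both lists are assumed sorted by internal item code with no duplicates,
    mirroring the ordered-list representation described in the text.
    """
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
        elif a[i] > b[j]:
            j += 1          # skip codes of b that a does not contain
        else:
            return False    # a[i] is smaller than everything left in b
    return i == len(a)


def union(a, b):
    """Merge two sorted code lists into one sorted list without duplicates."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

With unordered lists, the same comparison would need to test each element of one set against each element of the other, which is the product behaviour the text refers to.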

There  are  a  variety  of  ways  to  organize  levels  of  reference  within 
a  programming  system.  LISP,  for  example,  allows  any  level  of  referencing, 
but  gives  the  user  the  responsibility  for  keeping  things  straight.  ALGOL 
is  essentially  a  uniform  single-level  system.  In  ALGOL  68  [33],  the  system 
accounts  for  any  level  of  reference,  but  imposes  rigid  conventions  on  their 
use.  The  system  adopted  here  provides  three  levels:  itemvars  and  locals, 
items,  and  algebraic  data.  More  elaborate  referencing  structures  can  be 
built  using  associations,  but  are  the  user's  responsibility. 
1.c. Expressions

The additional expressions required for LEAP are divided into
those which yield algebraic values and those whose values are leap types.


1.c.1. Syntax

<additional algebraic expression> ::= datum(<item expression>) |
                                      count(<set expression>)

<additional boolean expression> ::= <item expression> ∈ <set expression> |
                                    <set expression> ⊂ <set expression> |
                                    <set expression> = <set expression> |
                                    istriple(<item expression>)

<item expression> ::= <item> | <itemvar> | <local> | (<triple>) | any |
                      new(<algebraic expression>) |
                      <selector> <item expression>

<set expression> ::= φ | <set> | {<item expression list>} |
                     <set expression> <set operator> <set expression> |
                     <item expression> <associative operator>
                         <associative expression>

<associative expression> ::= <item expression> | <set expression>

<triple> ::= <item expression> • <associative expression> ≡
                 <associative expression>

<set operator> ::= ∩ | ∪ | −

<associative operator> ::= • | ' | *

<selector> ::= first | second | third
examples:  count. (u.ve. ) - aafni.i (i ill) -* >

{ hill, t oik ) c: (non:, IJ father kh n)

number • (part • hand ≡ finger) ≡ new(5)


1.c.2. Semantics

Any <item expression> yields an item as its value. If the item
has an associated algebraic datum then datum will yield its value,
otherwise datum is undefined. Any <set expression> yields a set as its
value. The value of count is the number of items (≥ 0) in the set. The
boolean relations, membership (∈), containment (⊂) and set equality (=),
have their usual meanings. The predicate istriple is true if its argument
evaluates to an item which is a parenthesized triple (cf. next paragraph).

Either an itemvar or a local can be used wherever an item can. There
is a forcing convention [>•■] which uses the value (an item) of the local
or itemvar in these cases. The reserved item any can only appear in triples
which are not operands of make (cf. 1.d.2). The unary operator new causes
a new item of the type of its argument to be created. A triple enclosed
in parentheses is also an <item expression>; a position (first, second,
or third) of such an <item expression> can be accessed by a <selector>.

example:

number • (part • hand ≡ finger) ≡ new(5)

In the construction of this triple, the system

1) forms a triple: part • hand ≡ finger

2) creates an item from the triple in 1)

3) creates a new integer item initialized to 5

4) forms the triple from number and the items of 2) and 3)


c;s:,  he  u: .  ici':'  a  d«nvl  at  >.•;  .sol  .  tin1  ii-.piy 


act.  4  or  ■<  bi.iv'. 


-  . i  .1.. 


The set operations of union, intersection and set difference are defined
in the usual way. The meaning of <item expression> <associative operator>
<associative expression> is specified in terms of the case in which the
two operands are both <item expression>s, with values P and Q. Then

P • Q is the set of all X such that P • Q ≡ X
      is in the universe of triples.

P ' Q is the set of all X such that P • X ≡ Q
      is in the universe of triples.

P * Q is (P • Q) ∪ (P ' Q)
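The three associative operators can be sketched over an explicit set of triples. This is only an illustration of the definitions: the universe is a Python set of (attribute, object, value) tuples searched by brute force, the names `dot`, `prime` and `star` are hypothetical, and the father triples are borrowed from the example in Section 1.d.2.

```python
def dot(universe, p, q):
    """P • Q: every X such that the triple P • Q ≡ X is in the universe."""
    return {v for (a, o, v) in universe if a == p and o == q}


def prime(universe, p, q):
    """P ' Q: every X such that the triple P • X ≡ Q is in the universe."""
    return {o for (a, o, v) in universe if a == p and v == q}


def star(universe, p, q):
    """P * Q: the union of P • Q and P ' Q."""
    return dot(universe, p, q) | prime(universe, p, q)


# Illustrative universe: father • tom ≡ bill, father • pete ≡ tom, ...
universe = {
    ("father", "tom", "bill"),
    ("father", "pete", "tom"),
    ("father", "bill", "don"),
}

fathers_of_tom = dot(universe, "father", "tom")      # {"bill"}
children_of_bill = prime(universe, "father", "bill") # {"tom"}
either_way = star(universe, "father", "tom")         # {"bill", "pete"}
```

The `star` form is convenient for relations that may have been stored in either direction, since it collects the answers of both one-variable questions.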


I  I*  '!  ii  I. 


■  it  as  it:-  value,  1  ho  value  of 


j".'  f.l:.' 


:  ci,"  ull  i._  I.'.  f ;  I"i'i ,  }  <;i  »n-  c:-.- fl  miuloisou::]; 


•  T  is .  ? .  i  :  v  I  ..’u.,.';  11  ..4vt  ;.  U  !  It.  J  iij*.  JOJJO  j  Fj^' 


state J..u;t:  ii.  .•’.eel i 


To.  u:  ■  o:  P 


!'l-  a-jd  :  :‘ip,  ju/.-uht- -  fcivinp  convent  it 


to  tiie  vf.'Ui.  1  u O:  ; 


to  i:...'lu  J-  !'it  W:.".  nz-anf  ‘  i  t:-l  c .  - 


J  i  ;■  '  j  -I  i !,-  ■  v-  i 


effect i vc:  U3i  of  LiAP.  "'lie  iin- 


pj  ui.e.-u  it'- :  o:.  o:  i. 


!  .*•  ti.  ... half  of  this  paper; 


...  in.'luiJ.-  the  various 




1.d. Statements


There  are  relatively  few  new  statement  types  in  LEAP  because 
the  control  structure  of  ALGOL  was  largely  adequate  for  our  purposes. 


1.d.1. Syntax

<additional statement> ::= <set statement> | <associative statement> |
                           <loop statement>

<set statement> ::= put <item expression> in <set> |
                    remove <item expression> from <set> |
                    <set> ← <set expression>

<associative statement> ::= make <triple> | erase <triple> |
                            delete <item expression> |
                            <itemvar> ← <item expression>

<loop statement> ::= foreach <local> in <set expression> do <statement> |
                     foreach <associative context> do <statement>

<associative context> ::= <element> | <associative context> and <element>

<element> ::= <admissible triple> | <boolean expression> |
              <local> in <set expression>


examples:

put tom in sons

sons ← sons ∪ {tom}

make father • tom ≡ bill

foreach x in sons do datum(x) ← datum(x)+2

foreach father • x ≡ bill do put x in sons
1.d.2. Semantics


The set assignment statement copies the value of the <set
expression> and assigns the result to the set variable. The put (remove)
operation is simply a more efficient way of doing union (subtraction)
of a single element. The make (erase) statements cause a triple to
be added to (subtracted from) the universe of triples (cf. Section 2.b
for details). An item which was created by a new expression can be
destroyed by delete. The internal identifier associated with this item will
be reassigned and it is the user's responsibility to ensure that there are
no uses of a deleted item.

The <loop statement>s are the raison d'etre for the entire system
and will be considered in some detail. The first alternative describes
the loop over the elements of a <set expression>. The <set expression>
is evaluated once and the <statement> is executed once for each member
of the resulting set. The local is the loop variable and is assigned the
successive elements of the set; it is treated as an itemvar within the
<statement>.

The loop statement over an <associative context> is much more
complicated. An associative context determines a set of simultaneous
relational equations which the system must solve to determine the values
of the loop variables. A loop variable is a local appearing in the
associative context which has not been bound in an enclosing loop. The set
of values for each loop variable is determined once at the beginning of the
loop and the <statement> is executed repeatedly with the loop variables
bound to each value in turn. The construct <admissible triple> is meant to
convey the fact that certain syntactically correct <triple>s are not
meaningful in an <associative context>. The compiler detects the following
cases:
a) the triple contains no unbound local

b) the triple contains three unbound locals

c) the unary operator new appears in a triple.

The <boolean expression> and <set expression> elements of an associative
context serve to restrict the set of values which can be assigned to a
loop variable. We will discuss the processing of a loop statement with
the aid of examples:

1) foreach x in sons do datum(x) ← datum(x)+2

This is a loop over a <set expression> which is simply a set sons.
If x has been declared integer local, then the effect of 1) will be to
increase by 2 the datum associated with each item in sons.

2) foreach father • x ≡ bill do put x in sons

We assume that father and bill are items, x is a local and
sons is a set; the algebraic type of any of these variables is irrelevant.
The system first computes the set of items {o_1, ..., o_k} which appear in
a triple:

father • o_i ≡ bill .

Now the body of the loop is executed k times, once with each o_i as the
value of x (considered to be an itemvar within the body). If sons was
initially empty it will be precisely the set {o_1, ..., o_k} at the end of
the execution of 2). If k = 0, the body of the loop is not executed.

In this simple case the loop statement could be replaced by

2') sons ← sons ∪ father ' bill .

3) foreach father • x ≡ bill and sex • x ≡ male do put x in sons

This statement is much the same as 2) except that the set of values
for x is made up of those items satisfying both associations. This
keeps bill's daughters out of the set sons. Notice that the order in
which the two relations are processed can be expected to have a marked
effect on performance, i.e. there are probably many more males than
children of bill. This situation will be discussed in Section J.

4) (a) foreach father • father • x ≡ z do make
           grandad • x ≡ z

   (b) foreach father • x ≡ y and father • y ≡ z do
           make grandad • x ≡ z .


The statements (a) and (b) above are entirely equivalent; the
compiler actually will transform (a) into (b), supplying a dummy local.
Assuming x, y, z are unbound locals, the statement (b) requires solving
for three loop variables. It is clearly inadequate to solve for each
local independently; the system must form an n-tuple of items (a
correspondence) which satisfies the associative context. In this case the
n-tuple is a 3-tuple x, y, z. The correspondences are computed in advance
and the body of the loop executed once for each correspondence. For
example, if the triples of the universe were


father • tom ≡ bill

father • pete ≡ tom

father • bill ≡ don

father • george ≡ clyde

the statement (b) would yield the correspondences

tom, bill, don
pete, tom, bill
and after the execution of (b) the universe would also contain the
triples

grandad • tom ≡ don
grandad • pete ≡ bill
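The formation of correspondences in example 4(b) can be mimicked by a small pattern-matching sketch. This is a naive nested search, not the hash-coded scheme the compiler uses; the convention that locals are strings beginning with "?" and the names `match` and `correspondences` are assumptions made only for this illustration.

```python
def is_local(term):
    """Locals are written here as strings beginning with '?' (an assumption)."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, binding):
    """Return binding extended so that pattern equals triple, else None."""
    new = dict(binding)
    for term, item in zip(pattern, triple):
        if is_local(term):
            # Bind the local, or check it against an earlier binding.
            if new.setdefault(term, item) != item:
                return None
        elif term != item:
            return None
    return new


def correspondences(universe, context):
    """All bindings of the locals that satisfy every triple of the context."""
    bindings = [{}]
    for pattern in context:
        extended = []
        for binding in bindings:
            for triple in universe:
                new = match(pattern, triple, binding)
                if new is not None:
                    extended.append(new)
        bindings = extended  # partial correspondences that failed are dropped
    return bindings


universe = [
    ("father", "tom", "bill"),
    ("father", "pete", "tom"),
    ("father", "bill", "don"),
    ("father", "george", "clyde"),
]
context = [("father", "?x", "?y"), ("father", "?y", "?z")]

# The correspondences are computed in advance, then the body runs once each.
found = correspondences(universe, context)
for b in found:
    universe.append(("grandad", b["?x"], b["?z"]))
```

Computing `found` before the loop body runs matters here: the body adds triples to the universe, and those additions should not influence the set of correspondences.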


1.d.3. Pragmatics

The implementation of sets was discussed briefly in Section 1.b.3.
The implementation of make and erase depends heavily on the structure and
is discussed in Section 2.b. The loop statement over a <set expression> is
actually a special case of the <associative context> loop, but is
implemented separately for efficiency. The associative loop is by far the
most complex construct in LEAP. The formation of correspondences entails
building many partial correspondences which later must be discarded. This
is, in itself, a significant data structure problem and has been
implemented with a hash-coded triple scheme associating a correspondence
number, a local, and an item. This, combined with the order of processing
considerations mentioned in example 3 above, makes the compilation of the
associative loop statement decidedly non-trivial. This problem is discussed
briefly in Section 2 and at great length in Hilbing's dissertation [13].




1.9.2. 


begin comment this program will determine if mary is related
      to joe by virtue of the fact that mary's paternal aunt
      is married to joe's paternal uncle and, if so, will
      record that fact;

item father, mary, joe, sex, married, male, related, reason;
local x, y;
set uncles;
boolean switch;

uncles ← φ; switch ← false;


foreach father • x ≡ father • father • joe and
        sex • x ≡ male do put x in uncles;
foreach father • x ≡ father • father • mary and
        married • x ≡ y and y in uncles do

begin
    make related • mary ≡ joe;
    make reason • (related • mary ≡ joe)
                ≡ (married • x ≡ y);
    switch ← true;
end;

write (switch)

end
2. The LEAP data-structure

The most important construct in LEAP is the <associative context>
which implicitly specifies a collection of ITEMS. In this section we
present a rather elaborate two-level storage scheme which was designed
to facilitate the processing of associative contexts. The problem of
retrieving the objects from memory which are related to a given object
has been a central concern in list processing systems since their inception.

The early list-processing languages (e.g. LISP [18], IPL [23]) achieved
a form of associative memory through the use of property (description)
lists. This enabled one to specify the name of a property (attribute)
and an object from which the system would retrieve the value or list of
values. This facility freed the user from remembering which ordinal number
he had mentally associated with a property. The property list was
implemented as a list of (property, value) pairs linked to the object. When
a retrieval was to be made, the system would search the property list of
the specified object for the specified property and would return the
associated value. Although this feature of list processing systems is
heavily used, there are serious problems with it from both a usage and an
implementation standpoint. All of the more recent attempts to produce
associative retrieval on conventional computers may be viewed as attempts
to solve one or both of these problems.

The problem with property lists from a user's point of view was that
they are one-way. If one has stored "SON(JOHN)=DON" there is no direct way
to find the X such that "SON(X)=DON". This problem became particularly
bothersome in computer graphics and led to the development of languages
such as L6 [15], CORAL [53, 34], AED [28], APL [7] and ASP [16] which have
recently been surveyed [38]. These languages automatically provided
two-way associations and, for reasons described in the next paragraph,
replaced the property list with a block (record) of continuous storage.
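The one-way character of property lists can be seen in a small sketch. The dictionary of (property, value) pairs below is a stand-in for the linked lists of the early list-processing systems; the function names and the second object JOE are hypothetical and chosen only for illustration.

```python
# Each object carries its own list of (property, value) pairs.
property_lists = {
    "JOHN": [("SON", "DON")],
    "JOE": [("SON", "DON")],
}


def get_property(obj, prop):
    """Forward retrieval: search one object's property list."""
    for name, value in property_lists.get(obj, []):
        if name == prop:
            return value
    return None


def inverse(prop, value):
    """Inverse retrieval: no direct path exists, so every object is scanned."""
    return [
        obj
        for obj, plist in property_lists.items()
        if any(name == prop and v == value for name, v in plist)
    ]
```

`get_property("JOHN", "SON")` touches a single list, whereas `inverse("SON", "DON")` must visit the property list of every object in the system, which is the difficulty the two-way languages were designed to remove.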

The abandonment of the property list was a victory for economy over
flexibility. The property list requires two cells per association and
requires searching at each retrieval. The idea of using a continuous
block of storage for the property list is not generally attributed to
Ross [?7]. The idea is simple: by assigning a small integer to each
attribute (e.g. SON = 5) one can achieve rapid retrieval and still only
use one cell per association. The usual implementation is to place the
address of the top of the block (internal name of the object) in an index
register and assemble a load or store instruction having that index field
and having the attribute number in the address field. This scheme is
efficient and has been used in a great variety of systems, but it, too,
has drawbacks.

The  first  difficulty  is  that  the  size  of  the  continuous  blocks  must 
be  specified  in  advance.  If  a  new  attribute  is  applied  to  an  object  at 
least  a  partial  recompilation  is  required.  A  further  difficulty  occurs 
in  scheduling  the  numbers  assigned  to  each  attribute;  there  are  three 
approaches to this problem:

a) Eliminate symbolic attribute names; this requires the user
   to remember the relative location of each attribute, but
   does not require any extra space ([34] and to some extent [7]).

b) Require that the type of block resulting from every
   associative retrieval be computable at translation time [36, 37].
   This need not always require extra space but makes the
   system impractical for the type of applications considered
   here. For example, the APL system [7] was developed because
   of the inadequacy of PL/I for computer graphics.

c) Assign a unique computation to each attribute name. This
   allows an attribute to apply to any type of object, but
   imposes a scheduling problem on the user which usually
   results in wasted space [1^].

In  addition,  all  of  these  schemes  share  the  problem  that  the  entire  block 
of  storage  for  an  item  must  be  allocated  on  the  first  use  of  that  item. 

The data structure schemes discussed above were designed for problems
for which the associative structure is rather simple and is contained in
main memory. More recently, there have been attempts to extend these
systems to a paging environment [2, b, 7]. The only previous work on
systems capable of handling all seven associative primitives (cf. Section
J.b) and large data bases has been in query languages and information
retrieval [6]. The thrust of these efforts has been rather different; they
have generally emphasized data bases too large for secondary storage and
have allowed for much slower response rates.

2.b. The hash-coded associative memory scheme presented here is an
attempt to solve the associative processing problem for a large range of
applications. Hash-coding (scrambling) is a well known technique for
processing symbol tables, etc. [17, ly] and there was early speculation [21]
that hash-coding could lead to efficient associative processing. The
particular scheme described here is an extension of our earlier work [8,
4> yy, JO] which is designed to work efficiently in a particular
time-sharing system [11].

The first requirement that we place on an associative memory scheme is
that it be capable of answering the seven retrieval requests obtained by
substituting zero, one or two variables into the association

ATTRIBUTE • OBJECT ≡ VALUE .

This requirement is an obvious extension of two-way links and has been very
useful in practice. The second major requirement is that any attributes,
objects and values may be combined and that an association can itself be
used as an item. The third requirement is that the time to retrieve
an association should be as small as possible and should be largely
independent of the total number of associations. A further requirement for
the system described here is that it perform as well as possible using
secondary storage within the rules of the time-sharing system.

The problem is to represent a universe of associations (triples,
relations, facts) of ITEMs so as to meet the requirements of the preceding
paragraph. A dictionary phase converts each ITEM to a unique integer so
we can consider a universe of triples of integers (A,O,V). The seven
primitive retrieval operations to be implemented are

1) (A,O,V)

2) (A,O,X)

3) (A,X,V)

4) (X,O,V)

5) (A,X,Z)

6) (X,O,Z)

7) (X,Z,V)

where X, Z denote unspecified identifiers (variables).

There are the additional problems that each variable may be
multi-valued and that a given primitive may assign values to two variables.
For example, if the universe were

(SON, DON, JOHN)

(SON, DON, JOE)

the following answers would result from primitive questions.

primitive question        result

(SON, DON, JOHN)          (SON, DON, JOHN)

(SON, JOHN, JOE)          NULL

(SON, DON, X)             X = {JOHN, JOE}

(X, DON, JOE)             X = SON

(X, DON, Z)               X = SON, Z = {JOHN, JOE}

(X, DON, DON)             NULL
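The meaning of the seven primitive questions can be pinned down with a brute-force sketch over the example universe. It deliberately ignores the hash-coded scheme: the function name `retrieve` and the use of `None` for an unspecified position are assumptions made for illustration only.

```python
def retrieve(universe, a=None, o=None, v=None):
    """Return every triple agreeing with the specified positions.

    A position left as None plays the role of a variable (X or Z).
    Leaving all three unspecified is not one of the seven primitives.
    """
    return [
        t
        for t in universe
        if (a is None or t[0] == a)
        and (o is None or t[1] == o)
        and (v is None or t[2] == v)
    ]


universe = [("SON", "DON", "JOHN"), ("SON", "DON", "JOE")]

exact = retrieve(universe, "SON", "DON", "JOHN")      # primitive 1
missing = retrieve(universe, "SON", "JOHN", "JOE")    # primitive 1, no match
values = {t[2] for t in retrieve(universe, "SON", "DON")}      # primitive 2
attrs = {t[0] for t in retrieve(universe, o="DON", v="JOE")}   # primitive 4
pairs = retrieve(universe, o="DON")                   # primitive 6
```

An empty list corresponds to the NULL entries of the table. The scan costs time proportional to the size of the universe, which is exactly what the third requirement above rules out and what the hash-coded pages are meant to avoid.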

The basic ideas underlying the implementation are simple. Consider
first the primitive questions involving one variable (numbers 2, 3, 4).
For each of these we hash-code the specified items to get an address leading
to the value of the unspecified one. This address is divided into a page
address and a location within the page. Multiple answers are kept on a
linked list which is required to be entirely within the page. The first
primitive question (A,O,V) can be answered using any of three pages; the
questions involving two variables are discussed below.

The three primitive questions with two variables each involve only
one specified item and so the answers could be represented by a separate
list for each item. The complicated structure described below results from
attempting to combine paging, hash-coding and lists in an efficient
manner.

There are three types of sections each of which is associated with
one of the positions A, O, V of the triple. The number of pages of each
type will vary with the size of the problem. We will consider a typical
A-section in some detail and then show how to extend the discussion to
O and V sections.

A typical A-section contains all triples having items numbered n
to n+k in the A position. Let us consider the simple case of a triple
t = (a,o,v) being placed in an otherwise empty memory. The high-order
bits of a determine the proper page for t; the remainder of a is hashed
with o to give a location in the page. At this location a cell of the
following form will be constructed:


    |  o  |  6  |  CONFLICT  |  a-USE  |  v  |

Here a, o, v are the elements of t and "6" is the value of a tag
showing the type of our cell (cf. Table 1). The CONFLICT field is used
if more than one (a,o) pair hashes to the same address. The a-USE field
is one link in the circular list of all uses of the item 'a' in attribute
position; this list provides all the answers to the primitive question
(a,X,Z). The situation can get considerably more complicated, as shown
in Figure 1. Figure 1 depicts a segment of the storage of an A type page
under the following conditions. The three pairs (a1,o1), (a2,o2), and
(a3,o3) all hash-code to the same address, i.e. there is a three-way
conflict. The associations (triples) represented are:


    [list of triples illegible in the source]

In addition, two of these triples have been used as ITEMS in other
associations; the internal item identifiers assigned to the respective
triples are recorded in the bottom half of the type 5 cells. The meaning
of each of the different cell types is given in Table 1. The situation
depicted in Figure 1 is a worst case and will occur infrequently.

    CELL TYPE    PURPOSE OF A CELL OF THAT TYPE

        0        A two-register free block.
        1        Represents a TRIPLE in a collection of TRIPLES.
        2        A one-register free block.
        3        Represents a collection of TRIPLES, in a "conflict"
                 situation.
        4        Represents a collection of TRIPLES.
        5        Represents a TRIPLE which is used as an ITEM.
        6        Represents a TRIPLE.
        7        Represents a TRIPLE, in a "conflict" situation.

    Table 1: Cell Types, and the Purpose for Each

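
The placement rule for an A-type page (high-order bits of a select the
page, the remainder of a is hashed with o, and colliding pairs share a
location) can be sketched as follows. This is a simplified model under
stated assumptions, not the cell layout of Figure 1: the constants
`PAGE_BITS` and `SLOTS`, the exclusive-or hash, the class `APages`, and
the use of Python lists in place of linked cells and the circular USE
list are all hypothetical.

```python
# Hypothetical sketch of A-page placement; sizes and hash are assumed.
PAGE_BITS = 4   # low-order bits of an item number kept within a page
SLOTS = 8       # hash-addressable locations per page


def locate(a, o):
    """High-order bits of a pick the page; the remainder of a is
    hashed with o to pick a location within that page."""
    page = a >> PAGE_BITS
    rest = a & ((1 << PAGE_BITS) - 1)
    return page, (rest ^ o) % SLOTS


class APages:
    def __init__(self):
        self.slots = {}  # (page, slot) -> cells; more than one = conflict
        self.uses = {}   # a -> list of (o, v), standing in for the USE list

    def make(self, a, o, v):
        """Insert the triple (a, o, v)."""
        chain = self.slots.setdefault(locate(a, o), [])
        for cell in chain:
            if cell["a"] == a and cell["o"] == o:
                if v not in cell["values"]:
                    cell["values"].append(v)
                    self.uses.setdefault(a, []).append((o, v))
                return
        chain.append({"a": a, "o": o, "values": [v]})
        self.uses.setdefault(a, []).append((o, v))

    def values(self, a, o):
        """Primitive question (a, o, X)."""
        for cell in self.slots.get(locate(a, o), []):
            if cell["a"] == a and cell["o"] == o:
                return list(cell["values"])
        return []

    def pairs(self, a):
        """Primitive question (a, X, Z), read off the USE list."""
        return list(self.uses.get(a, []))
```

With these assumed constants the pairs (1,2), (2,1) and (3,0) all map to
location 3 of page 0, which gives a three-way conflict of the kind the
text describes; a search along the chain is needed only in that case.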
The O and V type pages are organized in a similar fashion. Each O-page
contains all uses in the "object" position of some set of items. Also on
the O-page are the triples (a,o,v) for the objects of this page and for
all v. Each V-page contains the uses of a set of items in the "value"
position and the triples (a,o,v) for all a. Since we require that all
of the multiple values of a primitive associative question be contained
in one page, the pages will have to be of variable size. This is
consistent with the conventions of the executive system [11], but does
impose an unusual storage allocation scheme.

The storage scheme described below is designed to meet the requirements
of variable size pages containing three intertwined data structures. The
three structures are: the USE list, the hash-coded triples, and the free
storage list. The technique used is a variation of the standard idea of
allocating list space and free storage from opposite ends of a block of
memory. The central idea is the "striped" page; we divide storage into
classes based on the low-order bits; the particular scheme now in use is
described in Figure 2. Every set of eight consecutive registers is
divided into two which are addressable by hash-coding, one for the heads
of USE lists, and one single and two double free registers. There are
many other possible arrangements and one could design a system which
changed the storage layout automatically.
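
The idea of classifying registers by their low-order bits can be
sketched as below. The counts per stripe (two hash-addressable
registers, one USE-list head, one single free register and two double
free blocks) come from the text; which offsets within the stripe hold
which class is an assumption here, since that assignment is given only
in Figure 2. The names `STRIPE`, `register_class` and `hash_register`
are hypothetical.

```python
# Assumed stripe layout; the actual assignment is the one in Figure 2.
# Offsets 4-5 and 6-7 form the two double (two-register) free blocks.
STRIPE = (
    "hash", "hash",        # addressable by hash-coding
    "use_head",            # head of a USE list
    "free1",               # single free register
    "free2", "free2",      # first double free block
    "free2", "free2",      # second double free block
)


def register_class(address):
    """Class of a register, determined by its three low-order bits."""
    return STRIPE[address & 0b111]


def hash_register(page_base, k):
    """Address of the k-th hash-addressable register of a page.

    page_base is assumed to be a multiple of eight.
    """
    stripe, offset = divmod(k, 2)  # two hash registers per stripe
    return page_base + 8 * stripe + offset
```

Because the pattern repeats every eight registers, the class of any
register can be found from its address alone, without consulting a
separate map of the page.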

Let us now consider the overall operation of the storage scheme in
answering the primitive associative questions. The three questions (2),
(3), (4) are answered by hash-coding the two specified items and
retrieving the set of answers from the list at the resulting location on
the appropriate A, V or O page. The questions involving only one
specified item, (5), (6), (7), are answered directly from the USE list
for the item on the appropriate A, V or O page; the answer is a set of
O-V, A-O or V-A pairs. The fully specified association (1) can be tested
as either (2), (3), or (4) depending on which pages are in main memory.
Thus each primitive association can be answered in one page access and
without searching unless there is a hash-coding conflict. There are,
in addition, commands to insert triples into (MAKE) and delete triples
from (ERASE) the universe. The system also creates dummy ITEMs for
triples used in other triples.

As the reader has undoubtedly discovered, the price for this generality
and efficiency is storage space. Each association (triple) is represented
on an A, an O and a V type page; this requires approximately twice (not
thrice) the storage of our earlier scheme [8] because some of the
information is implicit in the address at which the triple is stored. The
major justification for this storage redundancy is that it is only
wasteful of secondary storage; the main memory requirements are actually
smaller than those of any other known scheme for answering all the
primitive associative questions.

There are some additional considerations which make the redundant
storage scheme less costly. Representing the information in three
different ways makes it possible to choose different page distribution
strategies for each type. The system does not normally update the three
affected pages when a MAKE or ERASE is executed; it merely records the
fact that the page must be updated the next time it is brought into main
memory to be accessed. An obvious addition would be an UPDATE command
which caused the system to do the updating when time was available.
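
The deferred updating just described might be modelled as follows. This
is a sketch under simplifying assumptions, not the TX-2 implementation:
the class `PagedStore`, its `pending` queue, and the convention that each
item has its own A, O and V page are all hypothetical.

```python
# Hypothetical sketch of deferred page updates; names are illustrative.
class PagedStore:
    def __init__(self):
        self.disk = {}     # page id -> set of triples (secondary storage)
        self.pending = {}  # page id -> queued (operation, triple) records
        self.loads = 0     # number of page accesses performed so far

    def _queue(self, op, triple):
        a, o, v = triple
        # One affected page of each type: A, O and V.
        for page in (("A", a), ("O", o), ("V", v)):
            self.pending.setdefault(page, []).append((op, triple))

    def make(self, triple):
        """Record an insertion without touching any page."""
        self._queue("MAKE", triple)

    def erase(self, triple):
        """Record a deletion without touching any page."""
        self._queue("ERASE", triple)

    def load(self, page):
        """Bring a page into main memory, applying queued updates first."""
        self.loads += 1
        contents = self.disk.setdefault(page, set())
        for op, triple in self.pending.pop(page, []):
            if op == "MAKE":
                contents.add(triple)
            else:
                contents.discard(triple)
        return contents
```

In this model a MAKE or ERASE costs no page access at all; the cost is
paid the next time the page is loaded. The UPDATE command suggested
above would amount to calling `load` on every page that still has
pending records.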
3. Conclusion


The associative programming language, LEAP, and the hash-coded
associative memory system have been in active use since early 1967.
Although there have been many modifications of the system, the basic
design has remained intact and appears to be sound. LEAP has become
the basic applications programming language for TX-2, although many
problems do not require an associative version. The associative features
have found application primarily in computer graphics.

We will briefly describe some typical applications; more detailed
information can be requested through the authors. One application is an
interactive program for the design of integrated circuits and the layout
of their masks. Another involves an interactive system for pole-zero
calculations in circuit design. There is also a program for displaying
constrained [33] figures, a significant non-procedural problem. There are
also two large interactive programming systems which have been written in
LEAP. The first is a system for synthesizing animated cartoons from a set
of continuous wave-forms specifying the motion of parts of the picture [1].
The second is the recently completed implementation [31] of the
two-dimensional programming language Ambit/G [4]. In addition, a graphical
debugging package for associative structures has been written.

The predominance of uses in interactive graphics systems is a result
of the interests of the user community. While LEAP is unabashedly a
special purpose language, we anticipate applications in many areas of
question answering [6, 25] and artificial intelligence. There are
continuations of some of the LEAP ideas in [17] and the as yet unpublished
work of Allen Kay at Utah, Edward Sibley at [...] and Tim Johnson at MIT,
which may extend the application areas.

Our experiences with the LEAP language and data structure have
suggested several possible extensions of each. The complete data structure
scheme described in Section 2 has not yet been used to nearly its capacity
and may be unnecessarily complicated for small graphics applications. More
particularly, it would be very useful to develop a scheme which did not
require a page access to ascertain that an association was not in the
store.

In general, the order in which an <associative context> is processed
can have a profound effect on the performance of the system. For example,
consider two equivalent specifications of the brothers of bill.

1)  father'  father (bill)  and  sex(x)  =  male 

2)  sex(x)  =  male  and  father'  father (bill) 

The second version will require time (and space for correspondences)
proportional to the number of males in the universe. A study of methods
for optimal rearrangement of associative statements has been completed [13]
and the methods developed there could be incorporated in the compiler.
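
The difference between the two orderings can be made concrete with a
small Python model. The data, the inverse index `CHILDREN` (standing in
for father'), and the functions `version1` and `version2` are assumptions
made for illustration; each function returns its answer together with
the number of candidates it had to examine.

```python
from collections import defaultdict

# Illustrative data only; not taken from the paper.
FATHER = {"bill": "ed", "tom": "ed", "ann": "ed", "joe": "al", "sam": "al"}
SEX = {"bill": "male", "tom": "male", "ann": "female",
       "joe": "male", "sam": "male", "ed": "male", "al": "male"}

CHILDREN = defaultdict(list)  # inverse of FATHER, i.e. father'
for child, dad in FATHER.items():
    CHILDREN[dad].append(child)

MALES = [x for x, s in SEX.items() if s == "male"]


def version1(person):
    """Take the children of the person's father, then test their sex."""
    candidates = CHILDREN[FATHER[person]]
    answer = {x for x in candidates if SEX[x] == "male"}
    return answer, len(candidates)


def version2(person):
    """Take all males in the universe, then test their father."""
    dad = FATHER[person]
    answer = {x for x in MALES if FATHER.get(x) == dad}
    return answer, len(MALES)


brothers1, examined1 = version1("bill")
brothers2, examined2 = version2("bill")
```

Both versions produce the same set, but the first examines only the
three children of bill's father while the second examines all six males.
As in the two specifications above, nothing excludes bill himself from
the answer.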

In the associative language, as in most high-level languages, the
full form of certain constructs (e.g. <loop statement>) becomes tedious;
the system could benefit from a simple macro processor [10]. The system
also suffers from the lack of item and set functions, although procedures
which have the same effect are available. At a more basic level one would
like the ability to manipulate entire correspondence structures (cf.
Section l.d.2). Finally, a truly complete associative system would include
a facility for implicitly solving for associations which are not
explicitly represented in the structure; this requires a theorem-proving
program as part of the access mechanism.

The study of software associative processing is expanding rapidly
and there are a large number of associative languages under development.
Most of these are less ambitious than LEAP, but the proposals for graphical
(two-dimensional) languages [4, 32] deserve some mention. An associative
structure is quite naturally looked upon as a colored directed graph and
it seems that associative processing is a natural area for graphical
languages. After an early attempt to extend [34] to an associative
language, we decided that this was not feasible with current technology.
Ironically, the only currently available graphical language was
implemented in LEAP [31]. It is too early to predict the usefulness of
this system and, in any event, the problems of associative processing will
be with us for some time to come.